The Questions
- What exactly is a T-SQL expression
- I want to display outputted numbers as currency, with commas and dollar signs. Is that possible in T-SQL?
- I see the plus sign (+) used in a variety of ways in T-SQL script, but I can’t always make sense of how SQL Server arrives at the results it does. Is there a “right way” to use plus signs?
- The NULLIF expression makes no sense. I would think you’d use it to somehow ferret out a NULL value or to test a condition for a NULL value. But it appears to merely return a NULL if two values are equal. Is that all it’s doing?
- Can you use expressions in data manipulation language (DML) statements?
- The CASE expression allows me to leave out the ELSE clause. Is that important to include it?
- Can you shed light on the order in which the database engine processes the components in an expression?
- I get confused on how to treat constants in my expressions. Is there a trick in using them?
- Are there any advantages to using a COALESCE expression rather than a CASE expression when checking for the first non-NULL value in a list of values?
- I’m working on a query that uses the NOT IN operator in the WHERE clause. The operator checks values in a list returned by a subquery. When the returned list includes a NULL value, the query returns an empty resultset, even though I know I should be seeing results. Any idea what might be happening?
- What the heck is a ‘modulo’?
- Are all WHEN and ELSE clauses in a CASE expression evaluated if the first WHEN clause evaluates to TRUE?
- What’s the difference between the two ‘not equal to’ operators (<> and !=)?
- I’ve come across operators such as += in variable assignment SET statements. What do they mean?
What exactly is a T-SQL expression?”
An expression is a block of code in a T-SQL statement that is processed as a unit in order to return a scalar (single) data value. Each expression is made up of one or more types of components, including constants, columns, variables, operators, scalar functions, and scalar subqueries. SQL Server supports the use of expressions within different types of T-SQL statements and within various parts of those statements. For example, you can define expressions in the SELECT
, WHERE
, GROUP
BY
, and ORDER
BY
clauses of a SELECT
statement.
The following T-SQL code contains a variety of expressions throughout the SET
and SELECT
statements:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
DECLARE @date DATETIME; SET @date = GETDATE() SELECT Name + ':' + SUBSTRING(ProductNumber, 4, 4) AS ProdName, ReorderPoint, CONVERT(INT, ROUND(ReorderPoint * 1.1, 0)) AS NewPoint, SafetyStockLevel - ReorderPoint AS PointDiff, CASE WHEN Color IS NULL THEN 'N/A' ELSE Color END AS ProdColor, DATEDIFF(mm, SellStartDate, @date) AS MonthDiff FROM Production.Product WHERE ProductID BETWEEN 316 AND 320; |
The first expression shows up in the SET
statement, after the equal sign. It is simply the GETDATE()
function. An expression is often made up of only a single element, such as a column, constant, or in this case, function.
Next we move to the SELECT
statement’s select list. The first expression we find defines the ProdName
column. The expression concatenates the Name
column with the last four digits of the ProductNumber
column. The expression includes two concatenation operators (+
), one literal string value (:
), and the SUBSTRING
scalar function, which extracts the last four digits from the ProductName
column.
The next expression in the select list is the ReorderPoint
column. The expression and name of the column are one and the same. This is followed by an expression that defines the NewPoint
column. The expression multiplies the ReorderPoint
value by 1.1
, uses the ROUND
function to round the result to a whole number, and then uses the CONVERT()
function to convert the value to an integer.
Next we have an expression that defines the PointDiff
column. In this case, the expression merely subtracts the ReorderPoint
value from the SafetyStockLevel
value, using the subtract operator to calculate the difference.
Our next expression defines the ProdColor
column. Here we use a CASE
expression to provide a default value should a NULL
be returned.
The final expression in the select list defines the MonthDiff
column. The expression uses the DATEDIFF
function to calculate the difference, in months, between the SellStartDate
value and the @date
variable value.
The next expression is in the search condition of the WHERE
clause. The expression compares the ProductID
value to the range of values defined by the BETWEEN
operator, the AND
operator, and the literal values 316
and 320
. You can use various operators to link expression elements together in this way, creating one large expression that is synthesized into a single unit that returns a scalar value. Note that the search condition can also be considered to be three expressions linked together by the BETWEEN
and the AND
operators: one column and two numerical constants.
Because of the expressions specified throughout the T-SQL code, our SELECT
statement returns the following results:
ProdName |
ReorderPoint |
NewPoint |
PointDiff |
ProdColor |
MonthDiff |
Blade:2036 |
600 |
660 |
200 |
N/A |
145 |
LL Crankarm:5965 |
375 |
413 |
125 |
Black |
145 |
ML Crankarm:6738 |
375 |
413 |
125 |
Black |
145 |
HL Crankarm:7457 |
375 |
413 |
125 |
Black |
145 |
Chainring Bolts:2903 |
750 |
825 |
250 |
Silver |
145 |
Clearly, expressions play an important role in a T-SQL statement. They can be as simple the ones shown here or far more complex.
What we’ve not covered, however, is the FROM
clause. In this case, the FROM
clause includes only the schema and table name. In some cases, the FROM
clause might include a table expression, such as a simple table-valued variable or an operand in an APPLY
clause. However, those types of expressions are usually specifically referred to as table expressions. When the term expression is used by itself, it’s normally referring to a scalar expression, such as those highlighted in the example above.
To know when and where you can use expressions, you should refer to the statement syntax in SQL Server Books Online. Not only will this show you where you can create expressions, but also whether there are any limitations on the expressions you create.
“I want to display outputted numbers as currency, with commas and dollar signs. Is that possible in T-SQL?”
Yes, it’s possible, but before we go into how that is done, be aware that such operations are generally best done by the application at the presentation level, rather than at the database level. Still, there are times when you are required to convert data into its representational form. So let’s look at how to format your currency.
Prior to SQL Server 2012, to get data into a currency format (at least for currencies with a structure similar to the US dollar), you could cast the value to the MONEY
type, if necessary, and then use the CONVERT
function to convert the value to a character type, as shown in the following example:
1 2 3 4 5 6 7 8 9 10 |
DECLARE @var1 NVARCHAR(20) = '555555555.5555', @var2 DECIMAL(20,4) = 555555555.5555, @var3 MONEY = 555555555.5555; SELECT '$' + CONVERT(NVARCHAR(20), CAST(@var1 AS MONEY), 1) AS amount1, '$' + CONVERT(NVARCHAR(20), CAST(@var2 AS MONEY), 1) AS amount2, '$' + CONVERT(NVARCHAR(20), @var3, 1) AS amount3; |
Each expression in the SELECT
list references one of the variables as its source data and is made up of multiple components. What provides the final format, however, is the CONVERT
function. The function requires that the source data be either the MONEY
or SMALLMONEY
type. When you call the function, you specify a second argument along with the source value. That argument determines the format of the data. In this case, we specify 1
so the outputted data includes commas every three digits to the left of the decimal point. We also concatenate the value with a dollar sign to give it that final touch, as shown in the following results:
amount1 |
amount2 |
amount3 |
$555,555,555.56 |
$555,555,555.56 |
$555,555,555.56 |
Overall, this isn’t too painful a process, unless you’re working with currencies that follow a different format, such as the Swiss franc, which takes apostrophes rather than commas. In such cases, you might have to add yet another function, such as REPLACE
, to substitute the commas with apostrophes.
Fortunately, since the release of SQL Server 2012, you’ve been able to instead use the FORMAT
function to output your numbers to a currency format. The FORMAT
function returns a value in a specified format, based on a specified culture. To demonstrate how this works, let’s first recast our preceding expressions into ones that use the FORMAT
function to configure the data as US dollar amounts:
1 2 3 4 5 6 7 8 9 10 |
DECLARE @var1 NVARCHAR(20) = '555555555.5555', @var2 DECIMAL(20,4) = 555555555.5555, @var3 MONEY = 555555555.5555; SELECT FORMAT(CAST(@var1 AS MONEY), 'c', 'en-us') AS amount1, FORMAT(@var2, 'c', 'en-us') AS amount2, FORMAT(@var3, 'c', 'en-us') AS amount3; |
Notice that we must first cast the character data (@var1
) to the MONEY
data type, but we can use the DECIMAL
variable (@var2
) as is. The second FORMAT
argument (c
) specifies that we’re formatting currency, and the third argument specifies the US symbol (en-us
) for the cultural context. The SELECT
statement returns the same results as the previous SELECT
statement, with all values returned as the NVARCHAR
type.
However, with the FORMAT
function, we can specify other cultures. For example, the following example uses the German symbol (de-de
) to provide the cultural context:
1 2 3 4 5 6 7 8 9 |
DECLARE @var1 NVARCHAR(20) = '555555555.5555', @var2 DECIMAL(20,4) = 555555555.5555, @var3 MONEY = 555555555.5555; SELECT FORMAT(CAST(@var1 AS MONEY), 'c', 'de-de') AS amount1, FORMAT(@var2, 'c', 'de-de') AS amount2, FORMAT(@var3, 'c', 'de-de') AS amount3; |
Now our results are specific to how the euro is represented in Germany, as shown in the following table:
amount1 |
amount2 |
amount3 |
555.555.555,56 ⬠|
555.555.555,56 ⬠|
555.555.555,56 ⬠|
Certainly, the FORMAT
function is a handy tool if you want your results to be displayed in a specific currency. But just to reiterate, such formatting is generally handled by the calling application, with the source data retrieved in its raw state.
“I see the plus sign (+) used in a variety of ways in T-SQL script, but I can’t always make sense of how SQL Server arrives at the results it does. Is there a “right way” to use plus signs?”
SQL Server is a tricky devil. As you’ve discovered, plus signs have all sorts of meaning in T-SQL. You can use them to concatenate string values, add numerical values together, or indicate that a numerical value is a positive number, rather than negative. The following SELECT
statement demonstrates how to both concatenate and add values:
1 2 3 4 5 6 7 |
SELECT FirstName + ' ' + LastName AS FullName, FORMAT(SalesLastYear, 'c', 'en-us') AS SalesLastYear, FORMAT(SalesYTD, 'c', 'en-us') AS SalesYTD, FORMAT(SalesLastYear + SalesYTD, 'c', 'en-us') AS TwoYearSales FROM Sales.vSalesPerson WHERE TerritoryName IS NOT NULL; |
For the FullName
column, we’ve used the plus sign to concatenate the FirstName
and LastName
columns with a space in between. For the TwoYearSales
column, we’ve used the plus sign to add the SalesLastYear
and SalesYTD
columns together. In both cases, the database engine knows when to concatenate the values or when to add them together, as shown in the following results:
FullName |
SalesLastYear |
SalesYTD |
TwoYearSales |
Michael Blythe |
$1,750,406.48 |
$3,763,178.18 |
$5,513,584.66 |
Linda Mitchell |
$1,439,156.03 |
$4,251,368.55 |
$5,690,524.58 |
Jillian Carson |
$1,997,186.20 |
$3,189,418.37 |
$5,186,604.57 |
Garrett Vargas |
$1,620,276.90 |
$1,453,719.47 |
$3,073,996.36 |
Tsvi Reiter |
$1,849,640.94 |
$2,315,185.61 |
$4,164,826.55 |
Pamela Ansman-Wolfe |
$1,927,059.18 |
$1,352,577.13 |
$3,279,636.31 |
Shu Ito |
$2,073,506.00 |
$2,458,535.62 |
$4,532,041.62 |
José Saraiva |
$2,038,234.65 |
$2,604,540.72 |
$4,642,775.37 |
David Campbell |
$1,371,635.32 |
$1,573,012.94 |
$2,944,648.25 |
Tete Mensa-Annan |
$0.00 |
$1,576,562.20 |
$1,576,562.20 |
Lynn Tsoflias |
$2,278,548.98 |
$1,421,810.92 |
$3,700,359.90 |
Rachel Valdez |
$1,307,949.79 |
$1,827,066.71 |
$3,135,016.50 |
Jae Pak |
$1,635,823.40 |
$4,116,871.23 |
$5,752,694.62 |
Ranjit Varkey Chudukatil |
$2,396,539.76 |
$3,121,616.32 |
$5,518,156.08 |
There’s no real mystery here. When we use the plus sign with only character data, the database engine concatenates the values. In this case, the plus sign is considered a string concatenation operator. When we use the plus sign with numerical values, the database engine adds the values together. Under these circumstances, the plus sign is considered an addition operator.
Not surprisingly, when we mix and match the types of data, our results are less predictable. Take a look at the following example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
DECLARE @var1 NVARCHAR(10) = 'TestVar', @var2 CHAR(2) = 'CX', @var3 SMALLINT = 300, @var4 DECIMAL(10,4) = 1000.0001, @var5 BIT = 1, @var6 MONEY = 999.99, @var7 DATETIME = '2014-07-25 00:00:00.000'; SELECT @var1 + @var2 AS var1var2, @var3 + @var6 AS var3var6, @var4 + @var5 AS var4var5, @var7 + @var3 AS var7var3, @var6 + @var7 AS var6var7; |
As you can see, we’ve declared seven variables, each defined with a different type. We then use the plus sign to combine the variables in different ways, giving us the following results:
var1var2 |
var3var6 |
var4var5 |
var7var3 |
var6var7 |
TestVarCX |
1299.99 |
1001.0001 |
2015-05-21 00:00:00.000 |
2017-04-19 23:45:36.000 |
As we would expect, when we pair @var1
with @var2
, both character types, we end up with a concatenated string. When we pair @var3
to @var6
, both numerical types, the values are added together, even though @var3
and @var6
are different types.
Now we get to our third pairing: @var4
+
@var5
. This time around, we’re adding a DECIMAL
value to BIT
value. However, the database engine still treats both variables as numerical values and adds them together, even though SQL Server documentation suggests that you can’t do this with the BIT
type. (And really, why would you?)
The next item in the select list, @var7
+
@var3
, pairs a DATETIME
value with a SMALLINT
value. This time, we get a date that is 300 days later than the original date. When your expression includes the plus sign, the database engine defaults to the data type with the highest precedence, in this case, DATETIME
. Based on the type, the engine determines whether to use the plus sign to concatenate values or add them together. If the data type with the highest precedence is a numeric data type, the engine attempts to add the values. In this case, the engine adds the SMALLINT
value to the DATETIME
value as the specified number of days. (A DATETIME
value is actually stored as two integers.)
Finally, we pair a MONEY
date type with the DATETIME
date type (@var6
+
@var7
). Once again, DATETIME
takes precedence over MONEY
, so the database engine adds 999.99 days to our original date to come up with a date in 2017.
We can verify how data type precedence works with plus signs by using the SQL_VARIANT_PROPERTY
function to retrieve the data type of the expressions that pair the different variables:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
DECLARE @var1 NVARCHAR(10) = 'TestVar', @var2 CHAR(2) = 'CX', @var3 SMALLINT = 300, @var4 DECIMAL(10,4) = 1000.0001, @var5 BIT = 1, @var6 MONEY = 999.99, @var7 DATETIME = '2014-07-25 00:00:00.000'; SELECT SQL_VARIANT_PROPERTY(@var1 + @var2, 'basetype') AS var1var2, SQL_VARIANT_PROPERTY(@var3 + @var6, 'basetype') AS var3var6, SQL_VARIANT_PROPERTY(@var4 + @var5, 'basetype') AS var4var5, SQL_VARIANT_PROPERTY(@var7 + @var3, 'basetype') AS var7var3, SQL_VARIANT_PROPERTY(@var6 + @var7, 'basetype') AS var6var7; |
The SELECT
statement returns the returned data type for each expression, as shown in the following table:
var1var2 |
var3var6 |
var4var5 |
var7var3 |
var6var7 |
nvarchar |
money |
decimal |
datetime |
datetime |
In each case, the expression returns the type with the highest precedence and determines how the plus sign will be used.
So far in our examples, the database engine has been able to add or concatenate values without a problem because the engine could implicitly convert the data type with the lowest precedence to the type with the highest. But this isn’t always the case:
1 2 3 4 5 6 |
DECLARE @var1 NVARCHAR(10) = 'TestVar', @var3 SMALLINT = 300; SELECT @var1 + @var3 AS var1var3; |
This time around, we’re using the plus sign to pair a NVARCHAR
value with a SMALLINT
value. Because SMALLINT
has precedence over NVARCHAR
, the database engine attempts to convert the value TestVar
to the SMALLINT
type, which of course is not possible. Consequently, the database engine returns the following error message:
1 2 |
Msg 245, Level 16, State 1, Line 5 Conversion failed when converting the nvarchar value 'TestVar' to data type smallint. |
The only way we can pair a numerical value with a string value is to concatenate them. In order to do so, we must explicitly convert the SMALLINT
value to a character data type:
1 2 3 4 5 |
DECLARE @var1 NVARCHAR(10) = 'TestVar', @var2 SMALLINT = 300; SELECT @var1 + CAST(@var2 AS NVARCHAR(10)); |
Now our SELECT
statement returns the value TestVar300
.
There’s one other use of the plus sign that you should be aware of-as a unary plus operator. SQL Server supports two types of unary operators, plus and minus, which are used to designate whether a numeric value is positive or negative:
1 2 3 4 5 |
DECLARE @var1 DECIMAL(10,2) = +123.45, @var2 DECIMAL(10,2) = -123.45; SELECT @var1 AS var1, @var2 AS var2; |
All we’ve done here is to designate our two literal values as positive and negative numbers. The SELECT
statement returns the following results:
var1 |
var2 |
123.45 |
-123.45 |
Be aware, however, that you cannot use the positive unary operator to convert a negative number to a positive number. For that, you need to use the ABS
mathematical function.
“The NULLIF expression makes no sense. I would think you’d use it to somehow ferret out a NULL value or to test a condition for a NULL value. But it appears to merely return a NULL if two values are equal. Is that all it’s doing?”
That’s part of what it’s doing. When a NULLIF
expression compares two values, it returns the first value if the two values are not equal. If they are equal, the expression returns a NULL
value of the same data type as the first specified value. Here’s what a NULLIF
expression looks like in action:
1 2 3 4 5 |
DECLARE @var1 VARCHAR(20) = 'valueA', @var2 VARCHAR(20) = 'valueB'; SELECT NULLIF(@var1, @var2); |
In this case, the NULLIF
expression returns valueA
because the two values are not equal. If they were equal, the returned value would be NULL
. In addition, the returned value would be NULL
if the first value were NULL
.
When you know how a NULLIF
expression works, it doesn’t seem so bad, although its use still has a tendency to cause confusion. Luckily, we can achieve the same results by using a CASE
statement:
1 2 3 4 5 6 7 8 9 |
DECLARE @var1 VARCHAR(20) = 'valueA', @var2 VARCHAR(20) = 'valueB'; SELECT CASE WHEN @var1 = @var2 THEN NULL ELSE @var1 END; |
The advantage of the CASE
statement is that it’s a lot easier to understand. Sure, NULLIF
provides simpler syntax, but if it creates confusion, what’s the point?
“Can you use expressions in data manipulation language (DML) statements?”
You can use expressions wherever T-SQL syntax permits them, including DML statements. Take a look at part of the syntax for the UPDATE
statement’s SET
clause:
1 |
SET { column_name = { expression | DEFAULT | NULL } |
According to SQL Server Books Online, the expression placeholder can be a “variable, literal value, expression, or a subselect statement (enclosed with parentheses) that returns a single value.” This pretty much defines the term expression as we understand it. Note, however, that any expression you define must conform to the rules associated with the particular T-SQL statement in which the expression is defined. For example, an UPDATE
statement’s UPDATE
clause can include the TOP
clause, as shown in the following syntax:
1 |
UPDATE [ TOP ( expression ) [ PERCENT ] ] |
In this case, expression “specifies the number or percent of rows that will be updated.” In other words, you cannot specify a non-numerical string value as your expression, as you can with other types of expressions. That said, it’s still considered an expression.
So, yes, you can use an expression in a DML statement. In fact, they’re used all the time. Let’s look at an example. Suppose we create the following table in our database and insert a row of data:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
IF OBJECT_ID('Inventory', 'U') IS NOT NULL DROP TABLE Inventory; GO CREATE TABLE Inventory ( ProductID INT NOT NULL IDENTITY PRIMARY KEY, InStock INT NOT NULL, OnOrder INT NOT NULL ); GO INSERT INTO Inventory(InStock, OnOrder) VALUES(20, 50); |
Now let’s use an UPDATE
statement to modify that row of data:
1 2 3 4 5 6 7 8 9 10 11 |
DECLARE @AmtReceived INT = 47; UPDATE Inventory SET InStock = InStock + @AmtReceived, OnOrder = OnOrder - @AmtReceived WHERE ProductID = 1; SELECT * FROM Inventory WHERE ProductID = 1; |
Our SET
clause is updating two columns. Each SET
definition includes an expression that either adds the @AmtReceived
value to the target column or subtracts that value. If we run the SELECT
statement after the table has been updated, we come up with the following results:
ProductID |
InStock |
OnOrder |
1 |
67 |
3 |
If you’re writing T-SQL statements, you’re writing expressions, even if you don’t realize it. That’s why it’s important to refer back to the statement’s syntax whenever you’re uncertain how to define a particular element. Expressions are supported throughout most statements. The syntax for the UPDATE
statement, for example, includes the expression placeholder in 10 different places, which suggests that many of us might not be using some statements to their fullest capacity.
“The CASE expression allows me to leave out the ELSE clause. Is it that important to include it?”
Yes, but…. The ELSE
clause specifies the value to return if no other comparison operations (WHEN...THEN
) evaluate to true. If you omit the ELSE
clause, the database engine instead returns a NULL
value. A NULL
value might be fine, if that’s what you want. But what if it’s not? Take a look at the following example, which uses a CASE
expression to determine how to return an employee’s full name:
1 2 3 4 5 6 7 8 9 10 |
SELECT BusinessEntityID AS EmployeeID, CASE WHEN MiddleName IS NULL THEN FirstName + ' ' + LastName WHEN LEN(MiddleName) = 1 THEN FirstName + ' ' + MiddleName + '. ' + LastName ELSE FirstName + ' ' + MiddleName + ' ' + LastName END AS FullName, TerritoryName AS Territory FROM Sales.vSalesPerson WHERE TerritoryName IN ('central', 'northeast', 'southeast'); |
If the MiddleName
column is NULL
, that column is not included in the expression. Only the first and last names are concatenated. If the MiddleName
value is an initial, a period is added to the initial and concatenated with the first and last names. Otherwise, all three names are concatenated, as shown in the following results:
EmployeeID |
FullName |
Territory |
275 |
Michael G. Blythe |
Northeast |
277 |
Jillian Carson |
Central |
279 |
Tsvi Michael Reiter |
Southeast |
Now let’s modify the CASE
expression by removing the ELSE
clause:
1 2 3 4 5 6 7 8 9 |
SELECT BusinessEntityID AS EmployeeID, CASE WHEN MiddleName IS NULL THEN FirstName + ' ' + LastName WHEN LEN(MiddleName) = 1 THEN FirstName + ' ' + MiddleName + '. ' + LastName END AS FullName, TerritoryName AS Territory FROM Sales.vSalesPerson WHERE TerritoryName IN ('central', 'northeast', 'southeast'); |
This time a NULL
is returned if the MiddleName
column has a value that is more than an initial, as the following table shows:
EmployeeID |
FullName |
Territory |
275 |
Michael G. Blythe |
Northeast |
277 |
Jillian Carson |
Central |
279 |
NULL |
Southeast |
Clearly, you would not want to delete the ELSE
clause in this instance. You could, however, add a WHEN...THEN
condition that replaces the ELSE
clause, in which case, you’re still returning NULL
of none of the conditions are met, which might be fine under some circumstances. Consider the following example:
1 2 3 4 5 6 7 8 |
SELECT FirstName + ' ' + LastName AS FullName, CASE WHEN SalesQuota > 250000 THEN 'true' WHEN SalesQuota <= 250000 THEN 'false' END AS HigherQuota FROM Sales.vSalesPerson WHERE CountryRegionName = 'United States'; |
If the SalesQuota
value is great than 250,000, a true
is returned. If the value is less than or equal to 250,000, we get a false
. However, if the column is NULL
, neither condition can be true, so the CASE
expression returns a NULL
, as shown in the following results:
FullName |
HigherQuota |
Stephen Jiang |
NULL |
Michael Blythe |
true |
Linda Mitchell |
false |
Jillian Carson |
false |
Tsvi Reiter |
true |
Pamela Ansman-Wolfe |
false |
Shu Ito |
false |
David Campbell |
false |
Tete Mensa-Annan |
true |
Syed Abbas |
NULL |
Amy Alberts |
NULL |
You can choose to include an ELSE
clause and then provide a default value, such as N/A
(or even NULL
), whether or not you think it’s necessary. In fact, many developers think you should never omit the ELSE
clause, even if you believe it is impossible for one of the WHEN...THEN
conditions not to be met. Someone might make a change to the database that affects the statement or your thinking about possible outcomes might not be correct.
“Can you shed light on the order in which the database engine processes the components in an expression?”
The way in which an expression’s elements are processed depends on several factors: their listed order, whether any subset of elements is enclosed in parentheses, and what operators are used to connect the various elements. Parentheses group elements together to ensure that those elements are processed as a unit before being incorporated into the rest of the expression. In addition, the expression as a whole adheres to the concept of operator precedence, in which the operator types determine the order in which the expression is evaluated. For example, the multiply and division arithmetic operators take precedence over the greater than (>
) and lesser than (<
) comparison operations, which take precedence over logical operators such as AND
or ANY
. (You can find details about operator precedence in the MSDN topic “Operator Precedence.”)
Let’s start with a few simple arithmetic-based expressions to give you a sense of how this works. The following example declares several INT
variables, assigns an expression to each one, and returns the values from all three:
1 2 3 4 5 6 7 8 |
DECLARE @num1 INT = 3 * 2 + 5 - 7; DECLARE @num2 INT = 3 * (2 + 5) - 7; DECLARE @num3 INT = 3 * (2 + 5 - 7); SELECT @num1 AS num1, @num2 AS num2, @num3 AS num3; |
All we’re doing in each expression is multiplying and adding and subtracting values. However, because we use parentheses in the second and third expressions, our results come up quite different:
num1 |
num2 |
num3 |
4 |
14 |
0 |
The @num1
expression is fairly straightforward. We multiple 3 by 2, add 5, and subtract 7, which gives us a total of 4.
However, in the @num2
expression, we enclose the 2
+
5
elements in parentheses, so they’re processed first, giving us a total of 7. The 7 is then multiplied by the first 3, giving us 21. From there, we subtract 7, giving us a total of 14.
In the @num3
expression, the parentheses enclose the 2
+
5
-
7
elements, which are processed before the other elements. Because this returns 0, 3 is multiplied by 0, giving us a final result of 0.
Now let’s look at another example, which again declares three variables and applies different expressions to them:
1 2 3 4 5 6 7 |
DECLARE @SalesLastYear INT = 2500; DECLARE @SalesYTD INT = 1500; DECLARE @SalesQuota INT = 3000; SELECT @SalesQuota * 2 - @SalesLastYear + @SalesYTD * 2 AS SalesDiff1, (@SalesQuota * 2) - (@SalesLastYear + (@SalesYTD * 2)) AS SalesDiff2; |
Here’s what are results look like:
SalesDiff1 |
SalesDiff2 |
6500 |
500 |
The first expression includes no parentheses. However, the multiply operator takes precedence over the addition and subtraction operators, so the multiplication is performed first. That means we first multiply the SalesQuota
value by 2 and the SalesYTD
value by 2, giving us the expression 6000
-
2500
+
3000
, which equals 6500.
The second expression does use parentheses. The first set of parentheses returns a value of 6000. The second set of parentheses actually contains a third set, and those are the elements evaluated first, before the outer set of elements. After they’re evaluated, that part of the expression enclosed in the second set of parentheses becomes @SalesLastYear
+
3000
, giving us a total of 5500. That 5500 is then subtracted from the 6000, which is how we end up with 500.
Now let’s look at an expression that incorporates character data as well as numerical data. In the following SELECT
statement, our WHERE
clause includes both the OR
and the AND
logical operators:
1 2 3 4 5 6 7 8 |
SELECT Name AS ProductName, StandardCost, ListPrice, ProductLine FROM Production.Product WHERE ListPrice > StandardCost * 2.5 AND ProductLine = 'r' OR ProductLine = 't'; |
The WHERE
clause search condition contains three predicates separated by the AND
operator and the OR
operator. In its current state, the search condition states that the ListPrice
value must be greater than 2.5 times the StandardCost
value and the ProductLine
value must equal r
or the ProductLine
value must equal t
. According to the rules of operator precedence, the database engine first evaluates the AND
logical operator and then the OR
logical operator. Because of the placement of the operators, the first two predicates must both evaluate to true or the third predicate must evaluate to true. In other words, all returned rows must meet the first two conditions or meet the third condition. As a result, the statement returns 59 rows that qualify. The following table shows part of the results:
ProductName |
StandardCost |
ListPrice |
ProductLine |
Touring Front Wheel |
96.7964 |
218.01 |
T |
Touring Rear Wheel |
108.7844 |
245.01 |
T |
Touring-Panniers, Large |
51.5625 |
125.00 |
T |
Road Bottle Cage |
3.3623 |
8.99 |
R |
Racing Socks, M |
3.3623 |
8.99 |
R |
Racing Socks, L |
3.3623 |
8.99 |
R |
HL Touring Frame – Yellow, 60 |
601.7437 |
1003.91 |
T |
LL Touring Frame – Yellow, 62 |
199.8519 |
333.42 |
T |
HL Touring Frame – Yellow, 46 |
601.7437 |
1003.91 |
T |
HL Touring Frame – Yellow, 50 |
601.7437 |
1003.91 |
T |
HL Touring Frame – Yellow, 54 |
601.7437 |
1003.91 |
T |
HL Touring Frame – Blue, 46 |
601.7437 |
1003.91 |
T |
HL Touring Frame – Blue, 50 |
601.7437 |
1003.91 |
T |
HL Touring Frame – Blue, 54 |
601.7437 |
1003.91 |
T |
Suppose what we were really after was to return rows that met the first condition and either the second or third condition. In this case, we would have to modify our search condition slightly:
1 2 3 4 5 6 7 8 |
SELECT Name AS ProductName, StandardCost, ListPrice, ProductLine FROM Production.Product WHERE ListPrice > StandardCost * 2.5 AND (ProductLine = 'r' OR ProductLine = 't'); |
Notice that the second and third predicates are now enclosed in parentheses. Consequently, the database engine will first evaluate them as a unit. That means for all rows returned, ListPrice
must be greater than 2.5 times the value of StandardCost
and ProductLine
must equal r
or t
. The following table shows all the results now returned by the query.
ProductName |
StandardCost |
ListPrice |
ProductLine |
Road Bottle Cage |
3.3623 |
8.99 |
R |
Racing Socks, M |
3.3623 |
8.99 |
R |
Racing Socks, L |
3.3623 |
8.99 |
R |
Road Tire Tube |
1.4923 |
3.99 |
R |
Touring Tire Tube |
1.8663 |
4.99 |
T |
LL Road Tire |
8.0373 |
21.49 |
R |
ML Road Tire |
9.3463 |
24.99 |
R |
HL Road Tire |
12.1924 |
32.60 |
R |
Touring Tire |
10.8423 |
28.99 |
T |
Notice that we now have only nine rows and that each row meets both conditions for the ListPrice
and ProductLine
columns. By applying the parentheses, we have better controlled how the rules of operator precedence are applied.
“I get confused on how to treat constants in my expressions. Is there a trick in using them?”
A constant is a literal value that is either part of an expression or is the entire expression. The way in which you treat constants depends on the type of data. As a general rule, you enclose character data in single quotes, but leave numerical data without quotes, although it’s not quite as straightforward as this. Before we go into the oddities, though, let’s look at a few examples of character constants. The following T-SQL declares several character variables and then retrieves their values:
1 2 3 4 5 6 7 8 9 10 11 |
DECLARE @var1 VARCHAR(20) = 'some value', @var2 VARCHAR(20) = '', @var3 NVARCHAR(20) = N'some value + 1', @var4 NVARCHAR(20) = N'12345'; SELECT @var1 AS var1, @var2 AS var2, @var3 AS var3, @var4 AS var4; |
By default, when you’re working with character string constants, you enclose them in singe quotes. If you’re working with Unicode strings, you precede the opening quote with an uppercase N. Anything enclosed in the quotes is considered part of that string, even if the characters are numbers, operators, or other types of symbols, as shown in the following results:
var1 |
var2 |
var3 |
var4 |
some value |
some value + 1 |
12345 |
In SQL Server, we can instead enclose character strings in double quotes by setting the QUOTED_IDENTIFIER
option to OFF
, as shown in the following example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 |
SET QUOTED_IDENTIFIER OFF; GO DECLARE @var1 VARCHAR(20) = "its value", @var2 VARCHAR(20) = "it's a value", @var3 VARCHAR(20) = 'it''s a value'; SELECT @var1 AS var1, @var2 AS var2, @var3 AS var3; SET QUOTED_IDENTIFIER ON; GO |
When the QUOTED_IDENTIFIER
option is set to OFF
, we can use double quotes just like single quotes, as is the case with the @var1
declaration. The advantage of using double quotes is that you can pass in a special character, such as an apostrophe, without having to escape it by adding a second apostrophe, as we had to do for @var3
. The SELECT
statement now returns the results shown in the following table:
var1 |
var2 |
Var3 |
its value |
it’s a value |
it’s a value |
Be aware, however, that the QUOTED_IDENTIFIER
option has other implications besides being able to use double quotes for string constants. For example, the option affects the way identifiers are handled in T-SQL. In addition, the SQL Server Native Client Provider and ODBC driver set this option to ON
, which can impact your code if you had set the option to OFF
. For these reasons you’re usually better off leaving the option set to ON
.
As previously mentioned, you typically don’t enclose numerical constants in quotes, even if they include currency symbols or scientific notation. For example, the following T-SQL declares four types of numerical variables, sets their values, and retrieves those values:
1 2 3 4 5 6 7 8 9 10 11 |
DECLARE @var1 INT = 147, @var2 DECIMAL(10, 4) = 10.58, @var3 MONEY = $68.24, @var4 FLOAT = 87.9e; SELECT @var1 AS var1, @var2 AS var2, @var3 AS var3, @var4 AS var4; |
If we had enclosed the constants in quotes, the database engine would have interpreted the values as character data and have tried to convert it to the type of the applicable variable. For the first three variables, an implicit conversion would have worked fine, but it would mean unnecessary work on the part of the database engine. For the last variable, the engine would have returned the following error:
1 2 |
Msg 8114, Level 16, State 5, Line 2 Error converting data type varchar to float. |
When we specify a numerical constant without quotes, the database engine assumes a numerical type consistent with the value. For example, the value 147
is assumed to be an INT
, so no implicit conversion is necessary. The following table shows the results returned by the SELECT
statement.
var1 |
var2 |
var3 |
var4 |
147 |
10.5800 |
68.24 |
87.9 |
Even with numeric constants, an implicit conversion might be called for. In the following SELECT
statement, the constant 1000
is compared the value in the ListPrice
column, which is configured with the MONEY
data type:
1 2 3 4 5 6 |
SELECT Name AS ProductName, ListPrice FROM Production.Product WHERE ListPrice > 1000 AND Color = 'blue'; |
The 1000
is treated as an INT
value, so it must be implicitly converted to the MONEY
type. But that’s not all what’s going on in the WHERE
clause. Notice that it also includes a string constant (blue
). The constant is compared to the Color
column, which is configured with the NVARCHAR
data type. This means that the database engine must convert the VARCHAR
string to the NVARCHAR
type, which gives us the following results:
ProductName |
ListPrice |
HL Touring Frame – Blue, 46 |
1003.91 |
HL Touring Frame – Blue, 50 |
1003.91 |
HL Touring Frame – Blue, 54 |
1003.91 |
HL Touring Frame – Blue, 60 |
1003.91 |
Touring-2000 Blue, 60 |
1214.85 |
Touring-1000 Blue, 46 |
2384.07 |
Touring-1000 Blue, 50 |
2384.07 |
Touring-1000 Blue, 54 |
2384.07 |
Touring-1000 Blue, 60 |
2384.07 |
Touring-2000 Blue, 46 |
1214.85 |
Touring-2000 Blue, 50 |
1214.85 |
Touring-2000 Blue, 54 |
1214.85 |
Like string constants, date and time values are also enclosed in single quotes, even though the values are stored as integers, as you can see in the following example:
1 2 3 4 5 6 7 8 9 10 11 |
DECLARE @var1 DATETIME = 'August 1, 2014', @var2 DATETIME = '2014-07-28 16:24:27.293', @var3 DATE = '7/30/2014', @var4 TIME = '14:05:55.001'; SELECT @var1 AS var1, @var2 AS var2, @var3 AS var3, @var4 AS var4; |
The date and time data types will accept various formats when passing in a constant, as long as the value is enclosed in quotes. The following table shows the results returned by the SELECT
statement:
var1 |
var2 |
var3 |
var4 |
2014-08-01 00:00:00.000 |
2014-07-28 16:24:27.293 |
2014-07-30 |
14:05:55.0010000 |
You should also be aware of how to treat constants when working with some of the other data types. For example, a VARBINARY
value takes an 0x
prefix but is not enclosed in quotes. The same goes for a BIT
value. However, a UNIQUEIDENTIFIER
value is enclosed in quotes, as shown in the following T-SQL:
1 2 3 4 5 6 7 8 9 |
DECLARE @var1 VARBINARY(20) = 0x14Ad, @var2 BIT = 1, @var3 UNIQUEIDENTIFIER = 'CC421A37-E462-4AE0-8451-38F837FC5A1A'; SELECT @var1 AS var1, @var2 AS var2, @var3 AS var3; |
Our SELECT
statement now returns the following results:
var1 |
var2 |
var3 |
0x14AD |
1 |
CC421A37-E462-4AE0-8451-38F837FC5A1A |
Again, we generally enclose string values in quotes and leave the quotes off for numerical values, but as you’ve seen, some areas can be somewhat gray. If you get confused, check out the MSDN topic “Constants.”
“Are there any advantages to using a COALESCE expression rather than a CASE expression when checking for the first non-NULL value in a list of values?”
A COALESCE
expression is a syntactic shortcut for a CASE
expression. That means, when the query optimizer gets ahold of a COALESCE
expression, it rewrites it as a CASE
one, so in that sense, they’re one in the same. Let’s look at a couple of examples. In the first one, we use COALESCE
to return the first column that is not a NULL
value:
1 2 3 4 5 6 7 8 |
SELECT Name AS ProductName, Size, SizeUnitMeasureCode AS SizeUnit, WeightUnitMeasureCode AS WeightUnit, COALESCE(Size, SizeUnitMeasureCode, WeightUnitMeasureCode) AS FirstNotNull FROM Production.Product WHERE ProductID = 828; |
For this particular product, the Size
column and SizeUnitMeasureCode
column each contains a NULL
value, but the WeightUnit
column contains an actual value, G
, so the COALESCE
expression returns that value, as shown in the following results:
ProductName |
Size |
SizeUnit |
WeightUnit |
FirstNotNull |
HL Road Rear Wheel |
NULL |
NULL |
G |
G |
Now let’s look at what happens when we change the COALESCE
expression to a CASE
expression:
1 2 3 4 5 6 7 8 9 10 11 12 13 |
SELECT Name AS ProductName, Size, SizeUnitMeasureCode AS SizeUnit, WeightUnitMeasureCode AS WeightUnit, CASE WHEN Size IS NOT NULL THEN Size WHEN SizeUnitMeasureCode IS NOT NULL THEN SizeUnitMeasureCode WHEN WeightUnitMeasureCode IS NOT NULL THEN WeightUnitMeasureCode ELSE NULL END AS FirstNotNull FROM Production.Product WHERE ProductID = 828; |
The SELECT
statement returns the same results as the previous example. That said, the COALESCE
expression is a lot simpler, although the logic of the CASE
statement is easier to understand. In some cases, you might find that it’s better to use CASE
so your intent is clear to other developers.
One other consideration with COALESCE
. Developers sometimes run into problems with COALESCE
expressions because of data type issues. The value returned by COALESCE
is based on the data type of the input value with the highest precedence, which might or might not be the returned value. If an implicit conversion is required, and such a conversion is not permitted, COALESCE
will return an error. See the article “Questions about SQL Server Data Types You were Too Shy to Ask” for more details about this particular issue.
All that said, figuring out the logic behind COALESCE
is not too difficult, so choosing between CASE
and COALESCE
often comes down to nothing more than personal preference.
“I’m working on a query that uses the NOT IN operator in the WHERE clause. The operator checks values in a list returned by a subquery. When the returned list includes a NULL value, the query returns an empty resultset, even though I know I should be seeing results. Any idea what might be happening?”
Using NOT
IN
can be tricky, especially when NULL
values are involved. Case in point. The following SELECT
statement includes a WHERE
clause that checks for specific colors:
1 2 3 4 5 6 |
SELECT Name AS ProductName, Color FROM Production.Product WHERE FinishedGoodsFlag = 1 AND Color NOT IN (SELECT DISTINCT Color FROM Production.Product WHERE FinishedGoodsFlag = 0); |
Basically, what we’re doing is returning all products that are saleable items (FinishedGoodsFlag
=
1
) and whose color is not one of the colors of the non-saleable products. Unfortunately, when we run this query we receive an empty resultset.
If you’re familiar with the data, you might be surprised by these results. After all, the table contains 295 rows of saleable products. Either there are no colors unique to saleable items, or something is wrong.
We can investigate the problem by first running the subquery in the example above separate from the outer query:
1 2 3 |
SELECT DISTINCT Color FROM Production.Product WHERE FinishedGoodsFlag = 0; |
The following table shows the subquery’s results, which includes three distinct items:
Color |
NULL |
Black |
Silver |
As you can see, we have black and silver non-saleable items, along with those items for which no color has been assigned (the NULL
value). Now let’s modify the subquery to instead retrieve the colors of the orderable products:
1 2 3 |
SELECT DISTINCT Color FROM Production.Product WHERE FinishedGoodsFlag = 1; |
This time around, we again receive the same three values, along with another seven colors, as shown in the following table:
Color |
NULL |
Black |
Blue |
Grey |
Multi |
Red |
Silver |
Silver/Black |
White |
Yellow |
Clearly something is wrong. Rows that contain those seven additional colors should have been included in our original query results. The problem, it turns out, is the NULL
value. Let’s recast the logic of the WHERE
clause into several separate predicates, connected by the AND
logical operator:
1 2 3 4 5 6 |
SELECT Name AS ProductName, Color FROM Production.Product WHERE FinishedGoodsFlag = 1 AND Color <> 'Black' AND Color <> 'Silver' AND Color <> NULL; |
Essentially, our NOT
IN
operator indicates that the value in the Color
column should not be Black
and should not be Silver
and should not be NULL
, in addition to being a saleable product. The problem is with the equation Color
<>
NULL
. For a row to be returned, all conditions must evaluate to true; however, a NULL
cannot evaluate to true or false, which means the condition as a whole will never evaluate to true and no rows will ever be returned. If we were to run this statement, we would once again receive an empty resultset.
One way around this predicament is to add a second predicate to the WHERE
clause in our subquery:
1 2 3 4 5 6 |
SELECT Name AS ProductName, Color FROM Production.Product WHERE FinishedGoodsFlag = 1 AND Color NOT IN (SELECT DISTINCT Color FROM Production.Product WHERE FinishedGoodsFlag = 0 AND Color IS NOT NULL); |
Now our subquery returns only the values Black
and Silver
, which means our outer query should return all rows except those whose Color
value is one of the two colors. The following table shows part of the results, although the query actually returns 120 rows:
ProductName |
Color |
HL Road Frame – Red, 58 |
Red |
Sport-100 Helmet, Red |
Red |
Mountain Bike Socks, M |
White |
Mountain Bike Socks, L |
White |
Sport-100 Helmet, Blue |
Blue |
AWC Logo Cap |
Multi |
Long-Sleeve Logo Jersey, S |
Multi |
Long-Sleeve Logo Jersey, M |
Multi |
Long-Sleeve Logo Jersey, L |
Multi |
Long-Sleeve Logo Jersey, XL |
Multi |
HL Road Frame – Red, 62 |
Red |
HL Road Frame – Red, 44 |
Red |
HL Road Frame – Red, 48 |
Red |
If you were to run this query, you would see the Color
values black
and silver
are not included. However, what’s odd about this is that rows with NULL
for a color are not included either, even though NULL
is no longer being returned by the subquery. This might be fine in some cases, but in others you might want to see those values. Before we start looking for a workaround, however, another consideration is performance. Many SQL Server folks recommend using an EXISTS
operator rather than an IN
operator anyway. So let’s rewrite the last statement using NOT
EXISTS
:
1 2 3 4 5 6 |
SELECT Name AS ProductName, Color FROM Production.Product a WHERE a.FinishedGoodsFlag = 1 AND Color IS NOT NULL AND NOT EXISTS (SELECT * FROM Production.Product b WHERE FinishedGoodsFlag = 0 AND a.Color = b.Color); |
Our WHERE
clause now specifies that we include only saleable products, that they be a known color, and that the color not exist within the list of non-orderable colors. As a result, the SELECT
statement returns the same results as the preceding one, 120 rows. And if we want to include the saleable products with a Color
value of NULL
, we can simply remove the IS
NOT
NULL
expression:
1 2 3 4 5 6 |
SELECT Name AS ProductName, Color FROM Production.Product a WHERE a.FinishedGoodsFlag = 1 AND NOT EXISTS (SELECT * FROM Production.Product b WHERE FinishedGoodsFlag = 0 AND a.Color = b.Color); |
Our statement now returns 170 rows, which include those rows with a Color
value of NULL
. In theory, this statement should perform better than the ones using NOT
NULL
, but you might want to test that part out yourself to be sure. Just know that using NOT
IN
with NULL
values can return unexpected results, so be prepared.
“What the heck is a ‘modulo’?”
Remember when you first learned long division? You had the dividend, the divisor, and the remainder. The dividend is the number you start with. The divisor is the number you divide into the dividend. The remainder is what’s left. Modulo is a mathematical operator that returns the remainder.
Here’s how it works. In the follow example, @var1
serves as our dividend, and @var2
serves as our divisor:
1 2 3 4 5 6 7 |
DECLARE @var1 INT = 32, @var2 INT = 8; SELECT @var1 / @var2 AS Division, @var1 % @var2 AS Remainder; |
The first expression in our SELECT
clause uses the division operator (/
) to divide @var1
by @var2
. The second expression uses the modulo operator (%
) to calculate the remainder that’s produced when you divide these two numbers. The following table shows the results.
Division |
Remainder |
4 |
0 |
No surprise here. When you divide 32 by 8, you get 4, with a remainder of 0. But now let’s produce a remainder other than 0:
1 2 3 4 5 6 7 |
DECLARE @var1 INT = 36, @var2 INT = 8; SELECT @var1 / @var2 AS Division, @var1 % @var2 AS Remainder; |
This time around, we simply set @var1
to equal 36, rather than 32. Now our results show a remainder of 4.
That’s all there is to using the modulo operator. And you can use it with other numeric types:
1 2 3 4 5 6 7 |
DECLARE @var1 DECIMAL(10,4) = 55.556, @var2 INT = 5; SELECT @var1 / @var2 AS Division, @var1 % @var2 AS Remainder; |
In this case, we’re dividing 55.566 by 5 to come up with the following results:
Division |
Remainder |
11.111200000000000 |
0.5560 |
As you can see, the expression that contains the modulo operator returns a value of 0.5560
. As for the expression that contains the division operator, the operator’s returned value is based on the argument type with the highest precedence, which in this case is DECIMAL
.
“Are all WHEN and ELSE clauses in a CASE expression evaluated if the first WHEN clause evaluates to TRUE?”
According to the SQL Server documentation, a CASE
expression evaluates each condition sequentially and stops with the first condition that evaluates to true. In other words, the expression will short circuit when a condition is met, saving the database engine from having to perform unnecessary processing. This is a good thing. The less work the engine has to perform, the better.
To get a sense of how this works, let’s look at an example. The following SELECT
statement includes a CASE
expression that sets up three possible conditions when concatenating the name-related columns in the vSalesPerson
view:
1 2 3 4 5 6 7 8 9 10 |
SELECT BusinessEntityID AS EmployeeID, CASE WHEN MiddleName IS NULL THEN FirstName + ' ' + LastName WHEN LEN(MiddleName) = 1 THEN FirstName + ' ' + MiddleName + '. ' + LastName ELSE FirstName + ' ' + MiddleName + ' ' + LastName END AS FullName, TerritoryName AS Territory FROM Sales.vSalesPerson WHERE TerritoryName IN ('central', 'northeast', 'southeast'); |
The first CASE
condition checks whether the MiddleName
value is NULL
. If it is, the FirstName
and LastName
columns are concatenated and the expression stops running. If the column contains a value other than NULL
, the expression moves on to the next condition, which checks the length of the MiddleName
value. If it equals 1
, the three columns are concatenated and a period is added to the middle initial. The expression will then stop at that point. Otherwise, the ELSE
condition is applied and all three columns are concatenated as is. The database engine repeats the process for each row and returns the following results:
EmployeeID |
FullName |
Territory |
275 |
Michael G. Blythe |
Northeast |
277 |
Jillian Carson |
Central |
279 |
Tsvi Michael Reiter |
Southeast |
The CASE
expression’s ability to short circuit can help to avoid unnecessary processing when it can be substituted for other statement elements. For example, suppose we define the following SELECT
statement:
1 2 3 4 5 6 |
SELECT Name AS ProductName, Color FROM Production.Product WHERE FinishedGoodsFlag = 1 AND (Color = 'black' OR color = 'red'); |
Nothing flashy here. For each row, the database engine tests for the three specified conditions and determines whether that row is returned. Consequently, the resultset includes any saleable product that is either black or red, which means 127 rows evaluate to true. Part of those results are shown in the following table:
ProductName |
Color |
HL Road Frame – Black, 58 |
Black |
HL Road Frame – Red, 58 |
Red |
Sport-100 Helmet, Red |
Red |
Sport-100 Helmet, Black |
Black |
HL Road Frame – Red, 62 |
Red |
HL Road Frame – Red, 44 |
Red |
HL Road Frame – Red, 48 |
Red |
HL Road Frame – Red, 52 |
Red |
HL Road Frame – Red, 56 |
Red |
LL Road Frame – Black, 58 |
Black |
LL Road Frame – Black, 60 |
Black |
LL Road Frame – Black, 62 |
Black |
Suppose we want to reduce some of the work that the database engine has to perform. We can instead use a CASE
expression in the WHERE
clause, as shown in the following SELECT
statement:
1 2 3 4 5 6 7 8 9 |
SELECT Name AS ProductName, Color FROM Production.Product WHERE FinishedGoodsFlag = 1 AND Color = CASE Color WHEN 'black' THEN Color WHEN 'red' THEN Color END; |
If the Color
value equals black
, the condition evaluates to true and the expression stops running, thus avoiding any unnecessary processing. Yet the statement returns the same results as the preceding example.
Under the right circumstances, using a CASE
expression in this way can be a handy way to reduce your workloads. However, a CASE
expression might not always short circuit in the way you hope. For example, suppose your T-SQL includes logic is similar to that shown in the following example:
1 2 3 4 5 6 7 |
DECLARE @var1 INT = 0; SELECT CASE WHEN @var1 = 0 THEN 0 ELSE MAX(1/0) END; |
All we’re doing is declaring an INT
variable, setting its value to 0
, and using a CASE
expression to retrieve a specific value. Based on our understanding of the CASE
expression’s ability to short circuit, we would expect the expression’s first condition to evaluate to true and the expression to then stop. However, what we get from our SELECT
statement is the following error:
1 2 |
Msg 8134, Level 16, State 1, Line 3 Divide by zero error encountered. |
The error shows that the ELSE
condition did in fact run, even though it shouldn’t have. It turns out that under certain circumstances, a CASE
expression doesn’t behave like it’s supposed to behave. The problem appears to be related to aggregate functions (at least in part), so be aware that in some cases a CASE
expression might not deliver the benefits you’re hoping for and you might have to come up with a workaround to make the CASE
expression work the way you want it to.
“What’s the difference between the two ‘not equal to’ operators (<> and !=)?”
In T-SQL, the two comparisons operators are syntactically equivalent. They each perform a Boolean comparison between two expressions. If those expressions return non-NULL
, unequal values, the comparison evaluates to true, otherwise, it evaluates to false. However, if either value is NULL
, the evaluation returns a NULL
. Here’s a sample of the two operators in action:
1 2 3 4 5 |
DECLARE @var1 INT = 1; SELECT 'not equal' WHERE @var1 <> 2; SELECT 'not equal' WHERE @var1 != 2; |
The variable is set to a value of 1
, and in each SELECT
statement, that variable is compared to the constant 2
. As a result, each SELECT
statement returns the value not
equal
because each comparison evaluates to true. Even in terms of performance, there doesn’t appear to be much difference between the two.
However, there is one difference you should be aware of. The <>
operator conforms to ANSI standards; the !=
operator does not, despite that fact that a number of relational database systems support both operators. That said, even if there is only the slightest chance that you might port your script to another system, you’re better off sticking with the <>
operator.
“I’ve come across operators such as += in variable assignment SET statements. What do they mean?”
The add-equals operator (+=
) is a compound operator, one of several types of compound operators supported in SQL Server. A compound operator takes the original value and in some way amends it, rather than replacing it. For example, the following T-SQL uses the add-equals operator to update the value of the @var1
variable:
1 2 3 4 5 |
DECLARE @var1 INT = 2 SET @var1 += 1; SELECT @var1; |
We start by declaring @var1
and setting its value to 2
. We then use a SET
statement to add a value of 1
to the original value. The SELECT
statement returns the new value, which is 3
.
The add-equals operator makes it possible to add a value without having to specify the original value. Without the add-equals operator we would have to rewrite our SET
statement as follows:
1 2 3 4 5 |
DECLARE @var1 INT = 2 SET @var1 = @var1 + 1; SELECT @var1; |
The add-equals operator merely provides the shorthand necessary to avoid having to explicitly specify the original value, yet the SELECT
statement still returns a value of 3
. Plus, SQL Server supports other compound operators as well. For example, the following T-SQL subtracts 1
from the original value:
1 2 3 4 5 |
DECLARE @var1 INT = 2 SET @var1 -= 1; SELECT @var1; |
In this case, we’re using the subtract-equals (-=
) operator in our SELECT
statement, rather than the add-equals operator. The subtract-equals operator works the same way as the add-equals operator, except that the second value is subtracted rather than added. As a result, our SELECT
statement now returns a value of 1
.
Another example of a compound operator is the multiply-equals operator, which is shown in the following example:
1 2 3 4 5 |
DECLARE @var1 INT = 2 SET @var1 *= 3; SELECT @var1; |
In this case, the SET
statement uses the multiply-equals operator to multiple the variable value 2
by the constant value 3
, giving us a total of 6
. And compound operators are not limited to mathematical expressions. For example, the following T-SQL includes a SET
statement that uses the add-equals operator to concatenate values:
1 2 3 4 5 |
DECLARE @var1 VARCHAR(30) = 'some value' SET @var1 += ' and then some'; SELECT @var1; |
In this example, the SET
statement adds the string value and
then
some
(including the opening space) to the variable value some
value
, giving us the value some
value
and
then
some
. There are other compound operators as well, such as divide-equals (/=
) and modulo-equals (%=
). Check out the MSDN topic “Compound Operators” for more information.
Load comments